A comparison of training approaches for discriminative segmental models
Abstract
Segmental models such as segmental conditional random fields have had some recent success in lattice rescoring for speech recognition. They provide a flexible framework for incorporating a wide range of features across different levels of units, such as phones and words. However, such models have mainly been trained by maximizing conditional likelihood, which may not be the best proxy for the task loss of speech recognition. In addition, there has been little work on designing cost functions as surrogates for the word error rate. In this paper, we investigate various losses and introduce a new cost function for training segmental models. We compare lattice rescoring results for multiple tasks and also study the impact of several choices required when optimizing these losses.
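To make the contrast between conditional likelihood and a cost-sensitive surrogate concrete, the following minimal sketch computes both over a small set of rescored hypotheses. It is an illustration under stated assumptions, not the paper's method: the abstract does not name the losses it studies, the structured hinge loss is chosen here only as a familiar example, and the names `log_loss`, `hinge_loss`, `scores`, `costs`, and `gold` are hypothetical.

```python
import math


def log_loss(scores, gold):
    """Negative conditional log-likelihood of the gold hypothesis.

    scores: model scores, one per hypothesis (assumed to come from a lattice or n-best list).
    gold:   index of the reference hypothesis.
    """
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[gold]


def hinge_loss(scores, costs, gold):
    """Cost-augmented (structured) hinge loss.

    costs: task cost of each hypothesis against the reference,
           e.g. word errors; assumed to be 0 for the gold hypothesis.
    """
    return max(s + c for s, c in zip(scores, costs)) - scores[gold]


# Hypothetical example: three hypotheses, the first is the reference.
scores = [2.0, 1.0, 0.5]
costs = [0, 1, 2]
print(log_loss(scores, gold=0))
print(hinge_loss(scores, costs, gold=0))
```

The log loss ignores `costs` entirely, while the hinge loss demands a larger score margin over hypotheses with more word errors, which is one way a training loss can be tied more closely to word error rate.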
Similar resources
Lattice segmentation and minimum Bayes risk discriminative training
Modeling approaches are presented that incorporate discriminative training procedures in segmental Minimum Bayes-Risk decoding (SMBR). SMBR is used to segment lattices produced by a general automatic speech recognition (ASR) system into sequences of separate decision problems involving small sets of confusable words. We discuss two approaches to incorporating these segmented lattices in discrim...
Discriminative training for segmental minimum Bayes risk decoding
A modeling approach is presented that incorporates discriminative training procedures within segmental Minimum Bayes-Risk decoding (SMBR). SMBR is used to segment lattices produced by a general automatic speech recognition (ASR) system into sequences of separate decision problems involving small sets of confusable words. Acoustic models specialized to discriminate between the competing words in...
Efficient Segmental Cascades for Speech Recognition
Discriminative segmental models offer a way to incorporate flexible feature functions into speech recognition. However, their appeal has been limited by their computational requirements, due to the large number of possible segments to consider. Multi-pass cascades of segmental models introduce features of increasing complexity in different passes, where in each pass a segmental model rescores l...
Generative Kernels and Score-Spaces for Classification of Speech: Progress Report iii
May. This is the third and final progress report for Project /// (Generative Kernels and Score Spaces for Classification of Speech) within the Global Uncertainties Programme. This project combines the current generative models developed in the speech community with discriminative classifiers. An important aspect of the approach is that the generative models are used to define a score-space that can be us...
Improved performance and generalization of minimum classification error training for continuous speech recognition
Discriminative training of hidden Markov models (HMMs) using segmental minimum classification error (MCE) training has been shown to work extremely well for certain speech recognition applications. It is, however, somewhat prone to overspecialization. This study investigates various techniques which improve performance and generalization of the MCE algorithm. Improvements of up to 7% in relative...